Abstract - The function of the organism hinges on the performance
of its information-processing networks, which convey
information via molecular recognition. Many paths within these
networks utilize molecular codebooks, such as the genetic code,
to translate information written in one class of molecules into
another molecular "language". The present paper examines the
emergence and evolution of molecular codes in terms of
rate-distortion theory and reviews recent results of this approach.

We discuss how the biological problem of maximizing the 
fitness of an organism by optimizing its molecular coding ma- 
chinery is equivalent to the communication engineering problem 
of designing an optimal information channel. The fitness of a 
molecular code takes into account the interplay between the 
quality of the channel and the cost of resources which the 
organism needs to invest in its construction and maintenance. We 
analyze the dynamics of a population of organisms that compete 
according to the fitness of their codes. The model suggests a 
generic mechanism for the emergence of molecular codes as a 
phase transition in an information channel. This mechanism is 
put into biological context and demonstrated in a simple example. 

Index Terms — Molecular codes, rate-distortion theory, biolog- 
ical information networks, molecular recognition. 

I. Introduction 

Molecules are the carriers of information in the living cell. 
Myriad fluxes of molecular information are produced by the 
cell's biochemical networks. Other information fluxes enter 
the cell from the outside environment. All this information 
is read, integrated and further processed by the circuitry of 
the cell, which computes the cell's response to this input. 
This computation often includes the translation of molecular 
information written in one class of molecules into another 
class of molecules. For example, genes written in the language 
of DNA are translated into the language of amino-acids. 
The translation requires a molecular code, in this example, 
the genetic code [1]. This paper presents and reviews an 
information-theoretic approach EJ— O (or equivalently, a sta- 
tistical mechanics approach) to the biological question: How 
do molecular codes emerge and evolve? 

Constructing reliable coding machinery is a challenge to 
the organism, since this machinery must rely on molecular 
recognition interactions which take place in the noisy, crowded 
milieu of the cell. The typical binding energies are not much 
larger than the energy scale of thermal fluctuations, $k_B T$, 
rendering molecular recognition inherently prone to noise. 
Moreover, each molecular recognizer needs to locate its correct 
target within many lookalikes, which further complicates the 
task of recognition. On top of that, the construction of a code 
costs the organism time and resources and the organism has 
to maneuver between the conflicting needs for low cost and 
high reliability. 

To discuss how the interplay between quality and cost 
determines the fitness of a molecular code, we describe the 
code in terms of an information channel or a mapping, which 
relates two sets of molecules via recognition interactions. 
One may think of these two sets as molecular "symbols" 
and their possible "meanings". Optimizing molecular codes 
is a multi-scale task: On the small scale, the accuracy of 
each recognition event must be maximized (for further details 
see the conference paper by Y. Savir and 0-0). On the 
large scale - which is in the focus of the present paper - a 
fitter molecular code should assign meanings to symbols in 
a manner that reduces the impact of recognition errors, as 
measured by the distortion function. 

The need to improve its error-resilience drives the coding 
machinery to maximal accuracy. However, accurate recogni- 
tion also requires highly specific binding. We show that the 
cost of this chemical specificity is equivalent to the rate of 
the molecular information channel, i.e. the mutual information 
between the symbols and their meanings. The overall fitness 
of the code is a combination of the rate and the distortion. 



As evolution varies the control parameter that measures the 
relative significance of the rate and the distortion components 
of the fitness, the organism may reach a point where it becomes 
beneficial to invest resources in specificity in order to convey 
information through the channel. At this point - which is 
equivalent to a supercritical phase transition in a statistical 
mechanics system - a molecular code emerges. 

The rest of this paper is organized as follows. In Section II 
we define the molecular information channel. The related 
cost, quality and fitness functions are defined in Section III. 
In Section IV we derive the critical point, which describes 
the emergence of the molecular code. We examine a simple 
example for this generic scenario and discuss it in several 
regimes of population dynamics. In Section V we conclude 
by discussing the effect of the topology of the symbol space on 
the emergence of the code. 

II. Molecular codes as information channels 

Let us consider a molecular code as a mapping between two 
abstract chemical spaces that contain the two sets of molecules 
to be related by the code. One may refer to these sets as 
molecular symbols and their respective meanings. Perhaps the 
best known example is the genetic code [1], [2], in which 
the symbols are the 64 codons and the meanings are the 20 
amino-acids and the "stop" signal. A much larger molecular 
coding system, with thousands of symbols and meanings, is 
the transcription regulatory network. In this case, the DNA 
binding sites are the symbols and their potential meanings are 
the transcription factors that bind the sites [10], [11]. 

It is evident from the terminology of symbols and meanings 
that the problem of optimizing the quality and cost of a 
molecular code is actually a semantic problem: One has 
to assign meanings to symbols in an optimal manner that 
maximizes quality while minimizing the cost. To discuss this 
semantic problem, we consider a two-way information channel 
that relates the symbol space $\mathcal{S}$, with its $n_s$ symbols 
$i, j, k, \ldots$, and the meaning space $\mathcal{M}$, with its $n_m$ 
meanings $\alpha, \beta, \gamma, \ldots$ (Fig. 1). The channel is 
'two-way' since it describes how meanings are encoded and stored 
in memory as molecular symbols (the $\mathcal{M} \to \mathcal{S}$ 
direction), and how the symbols are read and then decoded to 
reconstruct the meaning (the $\mathcal{S} \to \mathcal{M}$ 
direction). For a simplified 'one-way' formulation see Q. 

Fig. 1. Molecular codes as noisy information channels. A molecular code 
is a mapping that relates the space $\mathcal{M}$ of molecular "meanings", 
$\alpha, \beta, \ldots, \omega$ (left), with the space $\mathcal{S}$ of 
molecular "symbols", $i, j, k, \ldots$ (right). The noisy communication 
channel is a three-stage Markov process (solid arrows), where each stage 
is described by its own stochastic matrix (see text): (i) A meaning, say 
$\alpha$, is encoded as a symbol $i$ by the encoder matrix $e_{\alpha i}$. 
(ii) The symbol $i$ is read as $j$ by the reader matrix $r_{ij}$. (iii) The 
symbol $j$ is decoded as $\omega$ by the decoder matrix $d_{j\omega}$. The 
"distance" between the original meaning $\alpha$ and the reconstructed 
meaning $\omega$ is given by the matrix element $c_{\alpha\omega}$ (dashed 
arrow). The distortion $D$ (1) is the average distance 
$\langle c_{\alpha\omega} \rangle$ along all possible paths 
$\alpha \to i \to j \to \omega$. The cost $I$ (2) is the mutual information 
between $\mathcal{M}$ and $\mathcal{S}$. The linear combination of $D$ and 
$I$ is the fitness $H$ (4). 

The information channel relies on error-prone molecular 
recognition and is therefore modeled as a three-stage Markov 
chain of stochastic processes [12]-[16]: (i) The representation 
of meanings as symbols is described by the stochastic encoder 
matrix $e : \mathcal{M} \to \mathcal{S}$. The matrix element 
$e_{\alpha i}$ is the probability that a meaning $\alpha$ is encoded 
by a symbol $i$ (each row obeys probability conservation, 
$\sum_i e_{\alpha i} = 1$). (ii) Next, the symbol is read. This 
process is described by the reader matrix 
$r : \mathcal{S} \to \mathcal{S}$. The matrix element $r_{ij}$ is the 
probability to read the symbol $i$ as $j$, which accounts for 
possible misreading errors. The diagonal elements are the 
probabilities to correctly read the symbols 
($\sum_j r_{ij} = 1$). (iii) Finally, the read symbol is decoded 
according to the decoder matrix $d : \mathcal{S} \to \mathcal{M}$. 
The matrix element $d_{j\omega}$ is the probability that the symbol 
$j$ is interpreted as carrying a meaning $\omega$ 
($\sum_\omega d_{j\omega} = 1$). The original meaning $\alpha$ passes 
the three stages of the channel and returns as a reconstructed 
meaning $\omega$. The effect of errors in the channel is measured 
by the distance $c_{\alpha\omega}$. Returning to the example of 
the genetic code, the channel encodes amino-acids as DNA 
codons. These codons are in turn read by the anti-codons of 
the tRNA. At its other side, the tRNA is charged with an 
amino-acid which is the decoded meaning at the output of 
the channel. This decoded amino-acid is ligated to the protein 
which is being synthesized by the ribosome Q), 0. 

The channel is defined by the pair of stochastic maps, e 
and d. When the cost of constructing a coding machinery is 
too high, the relation between symbols and meanings is 
non-specific. At this non-coding state, any molecular meaning is 
equally likely to be encoded by any of the available symbols 
and the encoder matrix therefore does not depend on the 
meaning, $e_{\alpha i} = u_i$, where $u_i$ is the usage of the 
symbol $i$. Similarly, the decoder matrix at the non-coding 
state does not depend on the symbol, $d_{j\omega} = f_\omega$, where 
$f_\omega$ is the demand for the meaning $\omega$, which accounts for the 
possibility that certain meanings are used more frequently than 
others. As will be discussed below, a code emerges when the 
matrices e and d become non-uniform. The non-uniformity 
signifies preference for binding between certain molecular 
symbols and meanings. 
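
The three-stage channel can be illustrated with a short numerical sketch. The sizes, demands, usages and misreading probability below are hypothetical values chosen only for illustration; they are not taken from the paper. The sketch builds the encoder, reader and decoder as row-stochastic matrices in the non-coding state and composes them into the end-to-end map from original to reconstructed meanings.

```python
import numpy as np

# Hypothetical toy sizes: 2 meanings and 3 symbols (illustration only).
f = np.array([0.7, 0.3])          # demand f_omega for each meaning
u = np.array([0.5, 0.3, 0.2])     # usage u_i of each symbol

# Non-coding state: every encoder row equals u, every decoder row equals f.
e = np.tile(u, (2, 1))            # encoder, shape (n_m, n_s)
d = np.tile(f, (3, 1))            # decoder, shape (n_s, n_m)

# Reader: a symbol is read correctly with probability 1 - eps and
# misread as each of the two other symbols with probability eps / 2.
eps = 0.1
r = (1 - eps) * np.eye(3) + (eps / 2) * (np.ones((3, 3)) - np.eye(3))

# End-to-end channel: probability that meaning alpha returns as omega.
channel = e @ r @ d
print(channel)
```

In this non-coding state every row of `channel` equals the demand vector `f`, so the reconstructed meaning carries no information about the original one.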

III. The fitness of molecular codes 

Having defined molecular codes in terms of noisy information 
channels, we now derive the quality and cost of these channels 
and use them to construct the fitness of the code. 

A. The distortion measures the quality of the code 

A natural way to estimate the quality of a molecular code 
is by examining how well the meaning that is reconstructed 
at its output preserves the original meanings at the input of 
the channel. This is measured by the distortion function $D$ 
[12], [17], [18], which is the average distance 
$\langle c_{\alpha\omega} \rangle$ along all 
possible paths between original and reconstructed meanings 
( E), H and references therein). Each of the possible paths 
$\alpha \to i \to j \to \omega$ is weighted by its probability, 
$P_{\alpha i j \omega}$, which is the product of the relevant entries 
in the encoder, reader and decoder matrices and the demand 
$f_\alpha$ for the original meaning, 

$$D = \sum_{\alpha, i, j, \omega} P_{\alpha i j \omega} \, c_{\alpha\omega} = \sum_{\alpha, i, j, \omega} f_\alpha \, e_{\alpha i} \, r_{ij} \, d_{j\omega} \, c_{\alpha\omega} . \qquad (1)$$

The reader matrix $r$ determines the topology of the symbol 
space $\mathcal{S}$. It implies some notion of proximity which may be 
represented as a graph $G(r, \mathcal{S})$ whose nodes are the symbols 
and whose edges connect symbols that are likely to be confused, i.e., 
have a significant $r_{ij}$ value 10, J5). Similarly, the distance 
matrix $c_{\alpha\omega}$ represents the topology of the meaning space $\mathcal{M}$. 
In the case of the genetic code, for example, the reader tends 
to confuse similar codons that differ by one base only. In 
the meanings space, close-by amino-acids are those that have 
similar chemical characteristics, such as polarity or size. Both 
topologies affect the distortion function |[T). 

An "ideal" perfect reader, $r_{ij} = \delta_{ij}$, would enable the 
"organism" to decode as many meanings as there are available 
symbols, since there is no chance of confusing symbols. 
However, for a realistic reader, which is imperfect due to the 
inherent recognition noise, it is preferable to decode fewer 
meanings and thereby minimize the effect of misreading. 
Moreover, the distortion function drives the preferable codes 
to be smooth, in the sense that symbols that are likely to 
be confused encode similar meanings. In other words, the 
mappings $e : \mathcal{M} \to \mathcal{S}$ and $d : \mathcal{S} \to \mathcal{M}$ tend to be continuous 

EMS), Go), El. 
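
The preference for smooth codes can be checked numerically. The sketch below is only an illustration under stated assumptions: four symbols arranged on a ring, a reader that misreads a symbol as either neighbor with probability `eps / 2`, two equally demanded meanings at distance `c0`, and deterministic decoders. None of these specific values come from the paper. It evaluates the distortion as the path-weighted average distance for a smooth assignment, in which neighboring symbols share a meaning, and for a scrambled one.

```python
import numpy as np

def distortion(f, e, r, d, c):
    """D = sum over alpha, i, j, omega of f_a * e_ai * r_ij * d_jw * c_aw."""
    return float(np.einsum("a,ai,ij,jw,aw->", f, e, r, d, c))

n_s, eps, c0 = 4, 0.2, 1.0        # hypothetical illustration values

# Ring-shaped reader: each symbol is misread as either neighbor with prob eps/2.
r = (1 - eps) * np.eye(n_s)
for i in range(n_s):
    r[i, (i + 1) % n_s] += eps / 2
    r[i, (i - 1) % n_s] += eps / 2

f = np.array([0.5, 0.5])          # symmetric demand for the two meanings
c = c0 * (1 - np.eye(2))          # zero distance on the diagonal, c0 otherwise

def deterministic_code(assignment):
    """Encoder and decoder for a fixed symbol -> meaning assignment."""
    d = np.eye(2)[assignment]                 # (n_s, n_m): symbol j decodes to its meaning
    e = d.T / d.sum(axis=0)[:, None]          # (n_m, n_s): uniform over a meaning's symbols
    return e, d

e_smooth, d_smooth = deterministic_code([0, 0, 1, 1])   # contiguous domains
e_rough, d_rough = deterministic_code([0, 1, 0, 1])     # alternating domains

D_smooth = distortion(f, e_smooth, r, d_smooth, c)
D_rough = distortion(f, e_rough, r, d_rough, c)
print(D_smooth, D_rough)
```

With these values the contiguous assignment has half the distortion of the alternating one, because each symbol has only one neighbor that carries the other meaning.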



B. The rate measures the cost of the code 

Molecular codes utilize physicochemical binding interac- 
tions to relate symbols and meanings. The encoder and de- 
coder matrices are, in fact, the binding probabilities between 
molecules from $\mathcal{M}$ and $\mathcal{S}$. A coding system with high binding 
specificity can accurately read symbols and thereby reduce 
the chance of assigning the wrong meaning due to misreading 
errors. It is evident, however, that highly specific binding also 
costs higher binding energy, which in general necessitates 
larger binding-sites. The cost of replicating, transcribing and 
translating the gene segment that encodes the binding site and 
the cost of keeping this segment free of mutations, are all 
expected to be roughly proportional to the binding site size. 
A reasonable estimate for the cost of the code is therefore the 
average size of the binding sites, which is roughly proportional 
to the average binding energy. 

To estimate the cost, we extract the average binding energy 
from the encoder matrix $e$. The matrix element $e_{\alpha i}$ is the 
probability that the molecule carrying the meaning $\alpha$ binds 
the molecular symbol $i$. For example, in the transcription 
regulatory network, $\alpha$ may be one of the transcription factors 
and $i$ is a prospective DNA binding site where $\alpha$ may bind. 
If the binding and unbinding events are fast, they obey the 
Boltzmann equilibrium distribution, 
$e_{\alpha i} \sim \exp \varepsilon_{\alpha i}$, where the 
binding energy $\varepsilon_{\alpha i}$ is measured in $k_B T$ units. 
It follows that the binding energy $\varepsilon_{\alpha i}$ scales like 
the logarithm of the binding probability, 
$\varepsilon_{\alpha i} \sim \ln e_{\alpha i}$. As a result, the average 
size of the binding site, and therefore the cost $I$ of the 
molecular code, are proportional to the average logarithm of 
the encoder matrix, 

$$I = \sum_{\alpha, i} f_\alpha \, e_{\alpha i} \ln \frac{e_{\alpha i}}{u_i} . \qquad (2)$$

The reference energies, 
$\varepsilon_i = \ln \sum_\beta f_\beta \exp \varepsilon_{\beta i}$, and the 
normalization of $e_{\alpha i}$ by the symbol usage 
$u_i = \sum_\beta f_\beta e_{\beta i}$ ensure that the cost vanishes when the 
binding is non-specific at the non-coding state, $e_{\alpha i} = u_i$. 

The cost (2) is nothing else than the mutual information 
between the symbols and the meanings, which is the entropy 
reduction due to the symbol-meaning correlation in the 
encoder. This is a common measure for the cost of a coding 
system, which measures the average number of bits required 
to encode one meaning, i.e., the rate of information passing 
through the channel. In principle, one would need to consider 
also the bit rate of the decoder, $d$. However, the optimal 
encoder and decoder are related through Bayes' theorem 
(see [4] and references therein), 

$$d_{j\omega} = \frac{f_\omega \sum_i e_{\omega i} \, r_{ij}}{\sum_{\beta, i} f_\beta \, e_{\beta i} \, r_{ij}} . \qquad (3)$$

Relation (3) expresses the intuitive notion that if the encoded 
meaning $\omega$ is likely to be read as the symbol $j$ then the symbol 
$j$ tends to be decoded as $\omega$. It also implies that, in practice, 
it is enough to specify only one of the encoder and decoder 
matrices in order to characterize the coding machinery of an 
organism. 
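
As a minimal sketch of these two ingredients, the code below computes the mutual-information cost and a Bayes decoder for a 2-by-2 example. The function names and the numerical values (`psi = 0.6`, `eps = 0.2`) are hypothetical. The decoder formula is an assumption of this illustration: it is the posterior probability of a meaning given the read symbol. Natural logarithms are used, so the cost is in nats rather than bits.

```python
import numpy as np

def cost(f, e):
    """Mutual information I = sum_{a,i} f_a e_ai ln(e_ai / u_i), in nats.

    Assumes every symbol has non-zero usage u_i.
    """
    u = f @ e                                  # symbol usage u_i
    joint = f[:, None] * e                     # joint probability of (meaning, symbol)
    mask = joint > 0                           # the limit 0 * ln 0 is taken as 0
    return float(np.sum(joint[mask] * np.log((e / u)[mask])))

def bayes_decoder(f, e, r):
    """Assumed Bayes relation: d_jw = f_w (e r)_wj / sum_b f_b (e r)_bj."""
    joint = f[:, None] * (e @ r)               # P(meaning w, read symbol j)
    return (joint / joint.sum(axis=0)).T       # shape (n_s, n_m), rows sum to 1

# Hypothetical 2-by-2 example.
psi, eps = 0.6, 0.2
f = np.array([0.5, 0.5])
e = 0.5 * np.array([[1 + psi, 1 - psi], [1 - psi, 1 + psi]])
r = np.array([[1 - eps / 2, eps / 2], [eps / 2, 1 - eps / 2]])

d = bayes_decoder(f, e, r)
I_val = cost(f, e)
print(d)
print(I_val)
```

For a uniform encoder the same `cost` function returns zero, in line with the statement that the cost vanishes at the non-coding state.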

C. The fitness is a rate-distortion functional 

To optimize the molecular coding apparatus, its two 
determinants, the cost $I$ and the distortion $D$, must be balanced. 
For the sake of simplicity, we express this interplay as the 
maximization of an overall code fitness, which is their linear 
combination 

$$H = -D - \kappa^{-1} I . \qquad (4)$$

The minus signs reflect the fact that while $I$ and $D$ need to be 
minimized, the overall fitness $H$ is driven by evolution towards 
maxima. The coefficient $\kappa = -dI/dD$ is the gain, which 
measures the bits of information required to decrease the 
distortion. The gain $\kappa$ is expected to increase with the richness 
of the environment and the complexity of the organism: A more 
complicated environment transmits more signals which require 
heavier computation of the cell's response to these signals. 
Similarly, the circuitry of a complex organism requires higher 
fluxes of information transfer. It is therefore beneficial for this 
organism to pay a larger cost to improve the quality of its code 
and thereby reduce the distortion D, since it gains more from 
such an improvement. 




Fig. 2. A simple 2x2 molecular code. The code maps two meanings, 
$\mathcal{M} = \{\alpha, \omega\}$, to two symbols, $\mathcal{S} = \{i, j\}$. 
The demand for the two meanings is symmetric, 
$f_\alpha = f_\omega = \frac{1}{2}$. By probability conservation and symmetry, 
the encoder has only one degree of freedom and the order parameter $\psi$ is 
one-dimensional (6). The distance is specified by the parameter $c_0$, the 
penalty for confusing meanings, whereas the reading matrix $r$ is specified 
by $\epsilon/2$, the average misreading probability. The evolution of the 
2-by-2 code is graphed as a function of the normalized gain 
$\kappa/\kappa_c$: Plotted are the cost $I$ (blue) and the distortion $D$ 
(green), given by (7). Also plotted are the fitness 
$H = -D - \kappa^{-1} I$ (red), which is shifted by $c_0/2$, and the 
order parameter $\psi$ (black). At low gain (left) the system is in the 
non-coding state of uncorrelated symbols and meanings ($\psi = 0$). When 
$\kappa$ increases above a critical value 
$\kappa_c = [c_0 (1 - \epsilon)^2]^{-1}$, the system undergoes a second-order 
coding transition. Following the coding transition the cost $I$ increases, 
but this is compensated by the decreasing distortion $D$, and thus the 
overall fitness $H$ increases. The parameters are $\epsilon = 1/5$, 
$c_0 = 1$, which yield $\kappa_c = 25/16$ (9). 



Since the cost or the rate is (up to a factor) the entropy loss 
due to coding, the conjugate parameter $\kappa^{-1}$ is equivalent to a 
temperature. The distortion D is equivalent to an interaction 
energy. The combination of rate and distortion is the fitness 
function H, a "free energy" which the organism tries to 
maximize by optimizing the channel parameters 0-J6), lfl2ll . 
ifTTl . High "temperatures" or small gains indicate a rising cost 
of binding sites, which drives the encoder and the decoder 
to homogeneity by reducing the specificity of the underlying 
binding interactions. At the other extreme, high gains or low 
"temperatures" drive the coding matrices to a non-random 
inhomogeneous state. The optimal code $e^*$ (and $d^*$), the 
one which maximizes the fitness $H$, is a function of three 
determinants: the reading matrix $r$, the distance $c$, and the 
gain $\kappa$. Below we discuss how a molecular code evolves as 
these determinants are varied. In particular, we show that a 
molecular code emerges at a critical transition in the noisy 
information channel. 

IV. Population dynamics in the code space and the emergence of molecular codes 

A. The code space 

To examine the response of a coding system to changes 
in the control parameters, $r$, $c$ and $\kappa$, let us consider a 
population of "organisms", i.e., self-replicating information 
processors that utilize coding systems. The organisms live in 
an environment where they compete according to the fitness 
of their codes. The code of each organism is specified by 
its encoder $e$ and decoder $d$ matrices. However, as discussed 
above, due to Bayes' theorem it suffices to specify only one of 
the matrices, say $e$. One may therefore describe the evolution 
of this population as the motion of points in a code space 
which is spanned by all possible encoders, $0 \le e_{\alpha i} \le 1$. 
This space is an $n_m \times n_s$-dimensional unit hypercube. Each 
axis of the cube corresponds to one entry of the encoder 
$e_{\alpha i}$. In fact, since every row $\alpha$ of the encoder satisfies 
probability conservation, $\sum_i e_{\alpha i} = 1$, the effective dimension 
is reduced to $n_m \times (n_s - 1)$. Each organism is represented by 
a point in the cube at a location that corresponds to its code. 
The population is represented by the probability density (or 
the number density) $\Psi(e_{\alpha i})$, which is the probability that a 
randomly picked organism has a given code. 

B. The optimal code 

For the sake of simplicity, we first examine large populations 
with negligible mutation rate. Such populations peak very 
sharply around an optimal code $e^*$ and can therefore be 
approximated by a delta distribution, $\Psi \sim \delta(e - e^*)$. The 
dynamics in this regime may be described by the motion of the 
optimum $e^*$ in response to changes in the system parameters, 
$r$, $c$ and $\kappa$. 

The optimal code maximizes the overall fitness $H$ (4). To 
calculate the corresponding encoder $e^*$, one augments $H$ with 
Lagrange multipliers to ensure that the $n_m$ probability 
conservation relations are satisfied, 
$H_T = H + \sum_\alpha \lambda_\alpha \sum_i e_{\alpha i}$. The 
optimal encoder code-matrix $e^*$ is located at the extremum, 
$\partial H_T / \partial e_{\alpha i} = 0$, which leads to |fl 

$$e^*_{\alpha i} = \frac{u_i \exp(-\kappa \Omega_{\alpha i})}{\sum_j u_j \exp(-\kappa \Omega_{\alpha j})} . \qquad (5)$$

This is a Boltzmann partition with effective "energies" 
$\Omega_{\alpha i} = \sum_{j, \omega} r_{ij} \, d_{j\omega} (2 c_{\alpha\omega} - \sum_\gamma d_{j\gamma} c_{\gamma\omega})$ 
and a "temperature" $\kappa^{-1}$. 
Organisms with lower $\kappa$ are "hotter" and their codes are 
noisier. 

Since both sides of (5) depend on the code matrix $e^*$, 
through (3) and the definition of the $\Omega$-s, it defines a 
self-consistency relation for $e^*$, which in general requires an 
iterative numerical solution @], J6], (TSJ. At low gains, when 
it is essential for the organism to minimize the cost $I$, the 
optimal code given by the solution of (5) is completely 
non-specific, $e^*_{\alpha i} = u_i$. At this non-coding state $I$ vanishes (2) 
since the encoder conveys no information about the meanings 
(it is $\alpha$-independent). As we show below, when the gain $\kappa$ 
increases, the code may remain non-specific for some range 
of $\kappa$. Then, when it surpasses a certain critical value, $\kappa_c$, the 
code undergoes a "coding transition" when it becomes specific. 
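
One simple way to solve such a self-consistency relation is a plain fixed-point iteration. The sketch below is an illustration only, not the authors' numerical scheme. It repeatedly recomputes the symbol usage, a Bayes decoder and the effective energies, and then applies the Boltzmann form of the encoder. The function names, the starting encoder, the iteration count and the parameter values are hypothetical. The decoder and energy formulas are assumptions of this sketch, written out in the docstrings.

```python
import numpy as np

def bayes_decoder(f, e, r):
    """Assumed Bayes relation: d_jw = f_w (e r)_wj / sum_b f_b (e r)_bj."""
    joint = f[:, None] * (e @ r)
    return (joint / joint.sum(axis=0)).T

def omega(r, d, c):
    """Omega_ai = sum_{j,w} r_ij d_jw (2 c_aw - sum_g d_jg c_gw)."""
    first = 2.0 * np.einsum("ij,jw,aw->ai", r, d, c)
    per_symbol = np.einsum("jw,jg,gw->j", d, d, c)   # sum_{w,g} d_jw d_jg c_gw
    second = r @ per_symbol                          # depends on the symbol i only
    return first - second[None, :]

def optimal_encoder(f, r, c, kappa, e0, n_iter=500):
    """Fixed-point iteration of e_ai = u_i exp(-kappa Omega_ai) / normalization."""
    e = e0.copy()
    for _ in range(n_iter):
        u = f @ e                                    # symbol usage
        d = bayes_decoder(f, e, r)
        w = u[None, :] * np.exp(-kappa * omega(r, d, c))
        e = w / w.sum(axis=1, keepdims=True)         # each row sums to 1
    return e

# Hypothetical 2-by-2 test case: misreading eps / 2, distance c0.
eps, c0 = 0.2, 1.0
f = np.array([0.5, 0.5])
c = c0 * (1 - np.eye(2))
r = np.array([[1 - eps / 2, eps / 2], [eps / 2, 1 - eps / 2]])
kappa_c = 1.0 / (c0 * (1 - eps) ** 2)
e0 = np.array([[0.6, 0.4], [0.4, 0.6]])              # slightly specific starting code

e_lo = optimal_encoder(f, r, c, 0.5 * kappa_c, e0)   # below the critical gain
e_hi = optimal_encoder(f, r, c, 2.0 * kappa_c, e0)   # above the critical gain
psi_hi = e_hi[0, 0] - e_hi[0, 1]
print(e_lo)
print(e_hi)
```

Below the critical gain the iteration relaxes to the uniform encoder. Above it, the iteration settles on a specific code whose order parameter satisfies `psi = tanh(2 * psi)` for this choice of gain.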

C. A simple 2x2 code 

To demonstrate the coding transition, we examine the 
simplest non-trivial example of a coding system that maps two 
meanings, $\mathcal{M} = \{\alpha, \omega\}$, to two symbols, 
$\mathcal{S} = \{i, j\}$ (Fig. 2). We assume that the demands for 
$\alpha$ and $\omega$ are equal, $f_\alpha = f_\omega = \frac{1}{2}$, from 
which it follows that the usage of symbols is also symmetric, 
$u_i = u_j = \frac{1}{2}$. The encoder $e$ has four entries and is 
constrained to a 2D unit square by the two conservation relations, 
$e_{\alpha i} + e_{\alpha j} = e_{\omega i} + e_{\omega j} = 1$. The 
symmetry of the setting implies that the distance $c$, the reading 
$r$, and the encoder $e$, are all 2-by-2 matrices determined by a 
single degree of freedom, 

$$c = c_0 \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad r = \begin{pmatrix} 1 - \epsilon/2 & \epsilon/2 \\ \epsilon/2 & 1 - \epsilon/2 \end{pmatrix}, \quad \text{and} \quad e = \frac{1}{2} \begin{pmatrix} 1 + \psi & 1 - \psi \\ 1 - \psi & 1 + \psi \end{pmatrix} . \qquad (6)$$

The parameter $c_0$ measures the average penalty of replacing $\alpha$ 
by $\omega$ or $\omega$ by $\alpha$, and $\epsilon/2$ is the average 
misreading probability. The deviation of the encoder from the uniform 
non-coding state $e_{\alpha i} = \frac{1}{2}$ is measured by the order 
parameter, $-1 \le \psi \le 1$. From (3) we find the decoder $d$ and 
substitution in (1), (2) yields the distortion and the cost, 

$$D = \frac{1}{2} c_0 \left[ 1 - (1 - \epsilon)^2 \psi^2 \right] \quad \text{and} \quad I = \frac{1}{2} \left[ (1 + \psi) \ln(1 + \psi) + (1 - \psi) \ln(1 - \psi) \right] . \qquad (7)$$

Interestingly, the resulting fitness $H = -D - \kappa^{-1} I$ is 
completely analogous, up to a minus sign, to the free energy of 
a mean-field Ising magnet. The distortion $D$ corresponds to the 
spin-spin interaction energy, whereas the cost $I$ corresponds 
to the entropy of the magnet. Within this analogy, the gain 
$\kappa$ is the inverse temperature and the magnetic interaction 
strength $J$ is $J = c_0 (1 - \epsilon)^2$. Just like in the magnet, one can 
increase the order and the correlation in the coding system 
by raising the interaction strength $J$, via increasing $c_0$ or 
decreasing the error probability $\epsilon$. Given the coding system 
parameters, $c_0$, $\epsilon$, and $\kappa$, one may locate the order parameter 
$\psi^*$ which maximizes $H$ and determines the optimal code $e^*$. 
The optimum may be found by calculating the extremum, 
$dH/d\psi = 0$, or directly from (5), which yield the familiar 
self-consistency equation of the Ising magnet 

$$\psi^* = \tanh \left[ \kappa c_0 (1 - \epsilon)^2 \psi^* \right] . \qquad (8)$$

As in the Ising model, it follows from the solution of (8) 
that the code remains in the random non-coding state, $\psi^* = 0$, 
as long as the gain $\kappa$ is below a critical value $\kappa_c$, which is 
equal to the inverse interaction strength, 

$$\kappa_c = J^{-1} = \left[ c_0 (1 - \epsilon)^2 \right]^{-1} . \qquad (9)$$

At $\kappa_c$, a coding state, $\psi^* \neq 0$, emerges at a continuous, 
second-order coding transition. Relation (9) indicates three 
possible pathways to approach the coding transition: (i) 
improving the reading accuracy (smaller $\epsilon$), for example by 
increasing the size of the specific binding sites, (ii) increasing 
the penalty $c_0$ of encoding a wrong meaning, and (iii) lowering 
the importance of cost, by increasing the gain $\kappa$. The first 
two pathways are equivalent to strengthening the magnetic 
interaction while the third one is analogous to lowering the 
"temperature", $\kappa^{-1}$. A biological example for a possible 
2-by-2 molecular code is discussed in 0. 
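
The 2-by-2 transition can be reproduced with a few lines of code. The sketch below iterates the Ising-like self-consistency equation for the order parameter and evaluates the distortion, cost and fitness from their closed forms. The function names, the starting value and the iteration count are hypothetical choices. The parameters `c0 = 1` and `eps = 1/5` are those quoted in the caption of Fig. 2.

```python
import math

def solve_psi(kappa, c0, eps, psi0=0.5, n_iter=2000):
    """Iterate psi -> tanh(kappa * J * psi), with J = c0 * (1 - eps)**2."""
    J = c0 * (1 - eps) ** 2
    psi = psi0
    for _ in range(n_iter):
        psi = math.tanh(kappa * J * psi)
    return psi

def distortion(psi, c0, eps):
    return 0.5 * c0 * (1 - (1 - eps) ** 2 * psi ** 2)

def cost(psi):
    def xlogx(x):
        return x * math.log(x) if x > 0 else 0.0   # x ln x -> 0 as x -> 0
    return 0.5 * (xlogx(1 + psi) + xlogx(1 - psi))

def fitness(psi, kappa, c0, eps):
    return -distortion(psi, c0, eps) - cost(psi) / kappa

c0, eps = 1.0, 0.2
kappa_c = 1.0 / (c0 * (1 - eps) ** 2)              # equals 25/16 for these values

psi_below = solve_psi(0.8 * kappa_c, c0, eps)      # non-coding regime
psi_above = solve_psi(1.5 * kappa_c, c0, eps)      # coding regime
H_coding = fitness(psi_above, 1.5 * kappa_c, c0, eps)
H_noncoding = fitness(0.0, 1.5 * kappa_c, c0, eps)
print(kappa_c, psi_below, psi_above, H_coding, H_noncoding)
```

Below the critical gain the order parameter decays to zero. Above it, a non-zero solution appears and its fitness exceeds that of the non-coding state at the same gain.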

D. The critical coding transition 

The notion of a coding transition, demonstrated above in 
the simple 2-by-2 code, can be generalized to larger coding 
systems. To locate the coding transition, one examines the 
stability of the non-coding state, $e_{\alpha i} = u_i$, with respect to 
small variations of the encoder, $\delta e_{\alpha i} = e_{\alpha i} - u_i$. The order 
parameter $\delta e_{\alpha i}$ reflects the preference of the symbol $i$ to 
encode the meaning $\alpha$ relative to the average usage $u_i$. This 
is equivalent to expanding the two-state Ising magnet into an 
$n_s$-state Potts model. A coding state emerges exactly at the 
point when the order parameter becomes non-zero, $\delta e_{\alpha i} \neq 0$, 
when meanings and symbols become correlated. The 
coding/non-coding transition takes place when the fitness maximum at 
the non-coding, symmetric state becomes unstable. By analysis 
of the curvature of the fitness landscape $H(e)$ [4], we find that 
the critical gain is 

$$\kappa_c = \frac{1}{2 \lambda^*_R \lambda^*_C} , \qquad (10)$$

where $\lambda^*_C$ is the maximal eigenvalue of the normalized 
distance, 
$C_{\alpha\omega} = (f_\alpha f_\omega)^{1/2} (\sum_\beta f_\beta c_{\beta\omega} + \sum_\gamma f_\gamma c_{\alpha\gamma} - \sum_{\beta,\gamma} f_\beta f_\gamma c_{\beta\gamma} - c_{\alpha\omega})$, 
and $\lambda^*_R$ is the second-largest eigenvalue of the weighted 
square of the reader, $R_{ij}$. 

In the case of the 2-by-2 code (6), $\lambda^*_R = (1 - \epsilon)^2$ and 
$\lambda^*_C = c_0/2$, which by substitution in (10) yield the critical value (9). 
It is interesting to note that the eigenvalue $\lambda^*_R$ corresponds 
to the smoothest non-uniform eigenvector $\delta e^*_{\alpha i} \neq 0$, which 
represents a coding state. This coding eigenvector is the 
first-excited state of the system, which measures the preference of 
meanings to be encoded by specific symbols (see |[2]]-|[6) and 
references therein). 
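
The 2-by-2 eigenvalues quoted above can be cross-checked numerically. The sketch below rests on two assumptions that are not stated in the text. First, the critical gain is taken to be `1 / (2 * lam_R * lam_C)`, which is the combination that reproduces the 2-by-2 critical value from the quoted eigenvalues. Second, for uniform symbol usage and a symmetric reader, the weighted square of the reader is taken to be the plain matrix product `r @ r`. The function name `normalized_distance` is hypothetical.

```python
import numpy as np

def normalized_distance(f, c):
    """C_aw = sqrt(f_a f_w) * (sum_b f_b c_bw + sum_g f_g c_ag - sum_bg f_b f_g c_bg - c_aw)."""
    col = f @ c                    # sum_b f_b c_bw, indexed by w
    row = c @ f                    # sum_g f_g c_ag, indexed by a
    mean = f @ c @ f               # sum_bg f_b f_g c_bg
    return np.sqrt(np.outer(f, f)) * (col[None, :] + row[:, None] - mean - c)

c0, eps = 1.0, 0.2
f = np.array([0.5, 0.5])
c = c0 * (1 - np.eye(2))
r = np.array([[1 - eps / 2, eps / 2], [eps / 2, 1 - eps / 2]])

# Maximal eigenvalue of the normalized distance.
lam_C = np.linalg.eigvalsh(normalized_distance(f, c)).max()

# Assumption: with uniform usage and symmetric r, use r @ r as the reader square.
lam_R = np.sort(np.linalg.eigvalsh(r @ r))[-2]     # second-largest eigenvalue

kappa_c = 1.0 / (2.0 * lam_R * lam_C)              # assumed form of the critical gain
print(lam_C, lam_R, kappa_c)
```

For `c0 = 1` and `eps = 1/5` this gives `lam_C = 1/2`, `lam_R = 16/25` and a critical gain of `25/16`, the value quoted for the 2-by-2 code.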

Our discussion so far assumed that the evolution of the code 
more or less follows the track of the optimal code, e*. How- 
ever, the coding system may get stuck at a metastable, sub- 
optimal state due to the ruggedness of the fitness landscape. 
There may exist, somewhere in the code fitness landscape, a 
superior, global optimum. Nevertheless, reaching this optimum 
is very hard and requires crossing deep 'valleys' or following 
very intricate pathways. This system may exhibit slow, 'glassy' 
dynamics. In this kind of almost frozen dynamics [19], the 
local landscape is much more important than the location of 
the global optimum. In addition, other effects of population 
dynamics, such as mutations and genetic drift, may drive the 
coding system towards suboptimal states. Mutations broaden 
the population, creating a "quasi-species" with a reduced 
effective fitness. Genetic drift delays the coding transition to 
higher gains (for further details see [4]-[6]). 

V. The topology of the symbol space and the coloring problem 

As mentioned above, the reader $r$ may be depicted in terms 
of a graph $G(r, \mathcal{S})$, which represents the topology of the 
symbol space $\mathcal{S}$ by drawing an edge between every pair of 
symbols that are likely to be confused. The second-largest 
eigenvalue of the reader squared, $\lambda^*_R$, which corresponds to 
the coding state, bears a special significance: The reader 
$r$ is related to the Laplacian of the symbol space $\Delta_S$ via 
$\Delta_S = \mathbb{I} - r$, where $\mathbb{I}$ is the identity matrix 0-10). Therefore, 
$\lambda^*_R$ corresponds to the second-smallest eigenvalue $\lambda^*_\Delta$ of the 
Laplacian, $\lambda^*_R = (1 - \lambda^*_\Delta)^2$ (in the degenerate 2-by-2 code, 
the second-smallest eigenvalue, $\lambda^*_\Delta = \epsilon$, is its only available 
excited state). The Laplacian operator appears naturally in the 
coding problem since it is the operator that describes a random 
walk on the symbol graph $G(r, \mathcal{S})$ via misreading events that 
move the molecular reader along edges connecting confused 
symbols. In fact, the eigenvalue $\lambda^*_\Delta$ sets the slowest relaxation 
rate of the system, so that its inverse is the longest relaxation 
time-scale. The corresponding eigenvector $\delta e^*_{\alpha i}$ 
is known to be the smoothest of all excited modes of the graph, 
in accord with the intuitive physical notion that the modes with 
the lowest "energy" eigenvalues and frequencies are those of 
the largest wave-lengths. 
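
The relation between the reader and the Laplacian of the symbol graph can be verified on a small example. In the sketch below, the ring of six symbols and the misreading probability are hypothetical choices made only for illustration. The reader misreads each symbol as either ring neighbor with probability `eps / 2`, the Laplacian is taken as the identity minus the reader, and the second-largest eigenvalue of the squared reader is compared with the second-smallest Laplacian eigenvalue.

```python
import numpy as np

n_s, eps = 6, 0.2                  # hypothetical ring of six symbols

# Reader on a ring: misread as either neighbor with probability eps / 2.
r = (1 - eps) * np.eye(n_s)
for i in range(n_s):
    r[i, (i + 1) % n_s] += eps / 2
    r[i, (i - 1) % n_s] += eps / 2

laplacian = np.eye(n_s) - r        # Laplacian of the symbol graph

lap_eigs = np.sort(np.linalg.eigvalsh(laplacian))
lam_delta = lap_eigs[1]            # second-smallest Laplacian eigenvalue

r2_eigs = np.sort(np.linalg.eigvalsh(r @ r))
lam_R = r2_eigs[-2]                # second-largest eigenvalue of the squared reader

print(lam_delta, lam_R, (1 - lam_delta) ** 2)
```

The check relies on the misreading probability being small, so that the mode closest to the uniform one is also the mode with the second-largest eigenvalue of the squared reader.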

It follows from Courant's theorem that the smooth, first 
excited mode $\delta e^*_{\alpha i}$ divides the graph into two contiguous positive 
and negative regions [2], [3]. In the positive region, $\delta e^*_{\alpha i} > 0$, 
the symbols will tend to encode certain meanings, whereas in 
the other region the chance to encode these meanings will be 
lower than the average, $\delta e^*_{\alpha i} < 0$. Thus, $\delta e^*_{\alpha i}$ partitions the 
graph with minimal boundaries between regions of opposite 
tendency to encode certain meanings and the emergent code 
is smooth, in the sense that adjacent symbols tend to encode 
similar meanings. This arrangement minimizes the distortion 
$D$ by decreasing the average distance $c$ between meanings 
encoded by adjacent symbols. For example, if a coding system 
has two possible meanings, say 'sea' and 'land', then it is 
clear that the distortion of an arrangement according to the 
lowest-excited mode, where there is one continent and one 
ocean, is much smaller than the distortion of an intricate 
arrangement with many islands, seas, peninsulas and bays. 
Indeed, in the case of the genetic code, all amino acids 
are encoded by synonymous codons arranged in contiguous 
domains except serine that splits into two domains [1], [2]. According 
to our model, the genetic code is smooth because the 
lowest-excited modes at the transition are the smoothest 
non-uniform modes. Similar continuity is found in the transcription 
regulation network [10], [11]. 
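
The two-region structure of the first excited mode can be seen on a simple graph. The sketch below is an illustration under assumptions: eight symbols arranged on a line, with a reader that confuses only adjacent symbols. Both the graph and the misreading probability are hypothetical. It computes the Laplacian eigenvectors and counts the sign changes along the line for the first excited mode and for the highest mode.

```python
import numpy as np

n_s, eps = 8, 0.2                  # hypothetical line of eight symbols

# Combinatorial Laplacian of a path graph: neighbors are adjacent symbols.
adjacency = np.zeros((n_s, n_s))
for i in range(n_s - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
graph_laplacian = np.diag(adjacency.sum(axis=1)) - adjacency

# Reader that misreads a symbol as each neighbor with probability eps / 2.
r = np.eye(n_s) - (eps / 2) * graph_laplacian
laplacian = np.eye(n_s) - r        # Laplacian of the symbol space

# eigh returns eigenvalues in ascending order with eigenvectors as columns.
eigenvalues, modes = np.linalg.eigh(laplacian)
first_excited = modes[:, 1]
highest = modes[:, -1]

def sign_changes(v):
    """Number of sign flips between consecutive symbols along the line."""
    s = np.sign(v)
    return int(np.sum(s[:-1] != s[1:]))

print(sign_changes(first_excited), sign_changes(highest))
```

The first excited mode changes sign once, splitting the line into one positive and one negative contiguous domain, whereas the highest mode alternates in sign at every step.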

The low modes partition the symbol graph G(r, S) into 
domains, which may be likened to drawing borders between 
countries on a map. We have found that the problem of 
maximizing the fitness of the code by optimizing this partition 
is related to another classical partition problem, the coloring 
problem 0, 0, 0- In the coloring problem, the goal is 
to calculate the minimal number of colors required to color 
an arbitrary map on a surface such that no two bordering 
countries have the same color. This minimal number is termed 
the "coloring number" of the surface and is determined by the 
surface topology. It follows from our model that the topology 
of the code sets the coloring number as an upper limit to 
the number of first excited modes, and thus to the number of 
encoded meanings. The relation of the coloring problem to 
the maximal number of first excited modes has a geometrical 
origin which is discussed in detail in 0, 0. For the genetic 
code, the coloring number estimate is in the range of 20-25, 
in the neighborhood of the naturally occurring number. In the 
transcription regulation network [11], the coloring number sets 
bounds that are close to the size of certain transcription factor 
families. 